---
title: "STA 533 Homework 7"
author: "Melanie Miller"
date: "West Chester University"
output: 
  html_document:
    toc: yes
    toc_float: yes
    number_sections: yes
    toc_collapsed: yes
    code_folding: hide
    code_download: yes
    smooth_scroll: true
    theme: lumen
editor_options:
  chunk_output_type: inline
---


<style type="text/css">

/* Table of content - navigation */
div#TOC li {
    list-style:none;
    background-color:lightgray;
    background-image:none;
    background-repeat:no-repeat;
    background-position:0;
    font-family: Arial, Helvetica, sans-serif;
    color: #780c0c;
}


/* Title fonts */
h1.title {
  font-size: 24px;
  color: darkblue;
  text-align: center;
  font-family: Arial, Helvetica, sans-serif;
  font-variant-caps: normal;
}
h4.author { 
  font-size: 18px;
  font-family: Arial, Helvetica, sans-serif;
  color: navy;
  text-align: center;
}
h4.date { 
  font-size: 18px;
  font-family: Arial, Helvetica, sans-serif;
  color: darkblue;
  text-align: center;
}

/* Section headers */
h1 {
    font-size: 22px;
    font-family: "Times New Roman", Times, serif;
    color: darkred;
    text-align: left;
}

h2 {
    font-size: 18px;
    font-family: "Times New Roman", Times, serif;
    color: navy;
    text-align: left;
}

h3 { 
    font-size: 15px;
    font-family: "Times New Roman", Times, serif;
    color: darkred;
    text-align: left;
}

h4 {
    font-size: 18px;
    font-family: "Times New Roman", Times, serif;
    color: darkred;
    text-align: left;
}
</style>


```{r setup, include=FALSE}
# code chunk specifies whether the R code, warnings, and output 
# will be included in the output files.
if (!require("tidyverse")) {
   install.packages("tidyverse")
   library(tidyverse)
}
if (!require("knitr")) {
   install.packages("knitr")
   library(knitr)
}
if (!require("jpeg")) {
   install.packages("jpeg", dependencies = TRUE)
   library(jpeg)
}

if (!require("RCurl")) {
   install.packages("RCurl", dependencies = TRUE)
   library(RCurl)
}

if (!require("plotly")) {
   install.packages("plotly", dependencies = TRUE)
   library(plotly)
}
if (!require("leaflet")) {
    install.packages("leaflet")              
    library("leaflet")
}
if (!require("tmap")) {
    install.packages("tmap")              
    library("tmap")
}

# Global chunk options: show code and results, hide warnings and messages.
knitr::opts_chunk$set(echo = TRUE,
                      warning = FALSE,
                      results = "markup",
                      message = FALSE,
                      comment = NA)
```


# Introduction - Gas Station Data
This analysis uses a gas station data set with 72,798 records, 30 feature variables, and 1 binary outcome variable. The maps below are based on a random sample of 500 gas stations in the US.


# Data Preparation

Importing the data set:
```{r}
poc<-read.csv(url("https://melaniemiller1.github.io/STA533/w07-HW/POC.csv"))
```


Take random sample of 500 gas stations in the US:
```{r}
set.seed(533)  # arbitrary seed so the same 500 stations are drawn on every knit
sample500 <- poc[sample(nrow(poc), 500), ]
```
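
Because `sample()` draws different rows on every run, the maps and the comments about them can change between knits unless the random seed is fixed. A minimal sketch of that behavior follows, using a small hypothetical data frame (`toy`) and an arbitrary seed value (`533`); both are assumptions for illustration and are not part of the POC data.

```r
# Hypothetical stand-in for the gas station data (not the real POC file).
toy <- data.frame(id = 1:20, xcoord = seq(-120, -75, length.out = 20))

# Draw 5 rows without replacement, resetting the seed before each draw.
set.seed(533)
draw1 <- toy[sample(nrow(toy), 5), ]
set.seed(533)
draw2 <- toy[sample(nrow(toy), 5), ]

# With the same seed, both draws select the same rows.
identical(draw1, draw2)
```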


# Gas Stations in the US - leaflet
```{r}
# Hover label for each station: address, county, state, ZIP code.
label.msg <- ~paste0(ADDRESS, ", ", county, " ", STATE, ", ", ZIPnew)

leaflet(sample500) %>%
  addTiles() %>%
  # Outline the bounding box of the sampled stations.
  addRectangles(
    lng1 = min(sample500$xcoord), lat1 = min(sample500$ycoord),
    lng2 = max(sample500$xcoord), lat2 = max(sample500$ycoord),
    fillColor = "transparent"
  ) %>%
  # Set the initial view to the same bounding box.
  fitBounds(
    lng1 = min(sample500$xcoord), lat1 = min(sample500$ycoord),
    lng2 = max(sample500$xcoord), lat2 = max(sample500$ycoord)
  ) %>%
  addMarkers(~xcoord, ~ycoord, label = label.msg)
```

This map of a random sample of US gas stations suggests that a higher proportion of stations is located on the East Coast. The Midwest shows large areas with few or no sampled stations, while hardly any areas on the East Coast are without one.
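
The visual impression above can be checked numerically by tabulating the sample by state. The sketch below uses a small hypothetical data frame with a `STATE` column like the one in `sample500`; the state values are invented for illustration and are not taken from the real sample.

```r
# Hypothetical sample with a STATE column (values invented for illustration).
toy_sample <- data.frame(
  STATE = c("PA", "PA", "NJ", "NY", "PA", "TX", "NJ", "KS", "NY", "PA")
)

# Count stations per state and sort from most to least frequent.
state_counts <- sort(table(toy_sample$STATE), decreasing = TRUE)

# Convert the counts to proportions of the sample.
state_share <- prop.table(state_counts)

state_counts
round(state_share, 2)
```

With the real data, the same two calls applied to `sample500$STATE` would show which states contribute the most sampled stations.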


# Gas Stations in the US - plotly
```{r}
# Map styling: restrict the view to the US and use an Albers USA projection.
g <- list(
  scope = 'usa',
  projection = list(type = 'albers usa'),
  showland = TRUE,
  landcolor = toRGB("gray95"),
  subunitcolor = toRGB("gray85"),
  countrycolor = toRGB("gray85"),
  countrywidth = 0.5,
  subunitwidth = 0.5
)

# One marker per sampled station; hover text shows state, county, address, ZIP.
fig <- plot_geo(sample500, lat = ~ycoord, lon = ~xcoord) %>%
  add_markers(
    text = ~paste(STATE, county, ADDRESS, ZIPnew, sep = "<br>"),
    hoverinfo = "text"
  ) %>%
  layout(title = 'Gas Stations in the United States', geo = g)

fig
```

This plotly map shows the same sample of gas stations in a different mapping format.



# Philly Crime Data
This analysis uses Philly crime data with 15,520 records and 18 variables, covering crime cases since 2015. A map was created from the subset of cases recorded in 2023.


# Data Preparation
Import the Philly Crime Since 2015 data:
```{r}
phillycrime<-read.csv(url("https://melaniemiller1.github.io/STA533/w07-HW/PhillyCrimeSince2015.csv"))
```


Extract the year from the variable `date` and add it to the data set as the new variable `year`:
```{r}
year<- format(as.Date(phillycrime$date, format="%m/%d/%Y"),"%Y")

phillycrime<- cbind(phillycrime, year)
```
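
The `format` argument of `as.Date()` has to match how the dates are written; entries that do not match are returned as `NA`. The sketch below uses made-up date strings, assumed to follow the same month/day/year layout as `phillycrime$date`, to show the extraction and a quick check for unparsed values.

```r
# Made-up date strings in month/day/year form, plus one in a different layout.
dates <- c("01/15/2015", "12/31/2022", "07/04/2023", "2023-07-04")

# Parse with the month/day/year format; the last entry does not match.
parsed <- as.Date(dates, format = "%m/%d/%Y")

# Pull out the four-digit year as a character string.
yr <- format(parsed, "%Y")

# Count how many dates failed to parse.
sum(is.na(parsed))
```

Running the same `sum(is.na(...))` check on the parsed `phillycrime$date` column would reveal whether any records are silently dropped when filtering by year.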


Subset the data to 2023 only:
```{r}
philly2023<- phillycrime %>%
             filter(year == 2023)
```


# Map of Crime in Philly in 2023

```{r}
# One marker color per row: orange where fatal is "Fatal", navy otherwise.
pal <- ifelse(philly2023$fatal == "Fatal", "orange", "navy")



leaflet(philly2023) %>% 
  addTiles() %>%
  setView(lng=-75.1652, lat=39.9526, zoom = 10.5) %>%
  
    addCircleMarkers(
            ~lng, 
            ~lat,
            color = pal,
            stroke = FALSE, 
            fillOpacity = 0.4,
            label = ~paste("Neighborhood:", neighborhood, 
                           "Sex:", sex, 
                           "Race:", race, 
                           "Age:", age))  %>%

  addLegend(position = "bottomright", 
            colors = c("orange", "navy"),
            labels= c("Fatal", "Nonfatal"),
            title= "Fatal",
            opacity = 0.4)  


```


This map shows crime in Philly in 2023. The navy points are nonfatal crimes, and the orange points are fatal crimes. Visually, nonfatal crimes appear to outnumber fatal crimes in 2023, although fatal crimes are still numerous.
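
The comparison read off the map can be backed by a simple tally of the `fatal` variable. The sketch below uses a hypothetical vector with the same two labels, `"Fatal"` and `"Nonfatal"`; the values are invented for illustration and are not results from the 2023 data. It also shows one way to map each label to its marker color with a named lookup vector.

```r
# Hypothetical outcome labels (not the real 2023 records).
fatal <- c("Nonfatal", "Fatal", "Nonfatal", "Nonfatal", "Fatal",
           "Nonfatal", "Nonfatal", "Nonfatal")

# Count cases per outcome and convert the counts to proportions.
outcome_counts <- table(fatal)
outcome_share <- prop.table(outcome_counts)

# Map each label to its marker color with a named lookup vector.
outcome_cols <- c(Fatal = "orange", Nonfatal = "navy")
marker_col <- unname(outcome_cols[fatal])

outcome_counts
round(outcome_share, 2)
```

With the real data, `table(philly2023$fatal)` would give the corresponding counts for 2023.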